Graph matching: filtering databases of graphs using machine learning techniques

Author

  • Christophe-André Irniger
Abstract

Graphs are a powerful concept useful for various tasks in science and engineering. In applications such as pattern recognition and information retrieval, object similarity is an important issue. If graphs are used for object representation, then the problem of determining the similarity of objects turns into the problem of graph matching. Some of the most common graph matching paradigms include graph and subgraph isomorphism detection, maximum common subgraph extraction and error-tolerant graph matching. A number of solutions for all of these tasks have been proposed in the literature, but they all suffer from the high computational complexity inherent to graph matching. An additional problem arises in applications where an input graph is to be matched not only to another single graph, but to an entire database of graphs under a given matching paradigm. If the database is large, sequential comparison of the input graph with each graph from the database using conventional approaches becomes infeasible. In this thesis the comparison of input graphs with databases of graphs is studied. Different retrieval paradigms, namely graph isomorphism, subgraph isomorphism, and error-tolerant matching are considered. The approach pursued is based on comparing feature vectors which have been extracted from the graphs. The idea is to use features that can be quickly computed from a graph on the one hand, but are, on the other hand, effective in discriminating between the various graphs in the database. Given a potentially large number of such features, the most powerful ones for discriminating the graphs in the database are determined by means of decision tree induction algorithms as known from machine learning. Under the proposed procedure, given an input, i.e. a query graph, a (preferably small) subset of possible candidates will be retrieved from the database. Only the graphs contained in this subset are then subject to full-fledged, expensive graph matching. 
Significant savings in computation time can be expected because the time complexity of graph feature extraction and decision tree traversal is small compared to that of full graph matching. In this work, database filters for three different matching paradigms (graph isomorphism, subgraph isomorphism, and error-tolerant matching) have been developed. They have been successfully tested on synthetically generated graphs with various characteristics as well as on publicly available real-world graph databases.
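
The filtering idea described in the abstract can be illustrated with a minimal Python sketch for the graph isomorphism case only. Everything in it is an assumption made for illustration and is not taken from the thesis: the graph encoding, the four features in `features`, the entropy-based multiway split in `build_tree`, the brute-force check in `isomorphic`, and all function names are hypothetical. The sketch only shows the general shape of the pipeline: extract cheap invariant features, use a decision tree over those features to narrow the database to a few candidates, and run the expensive matcher on the candidates alone.

```python
from collections import Counter, defaultdict
from itertools import permutations
from math import log2

# Assumed encoding: a graph is (n, edges) with nodes 0..n-1 and a set of
# undirected edges, each edge listed once (simple graphs only).

def features(graph):
    """Cheap isomorphism-invariant features (an illustrative choice)."""
    n, edges = graph
    degrees = [0] * n
    for u, v in edges:
        degrees[u] += 1
        degrees[v] += 1
    return (n, len(edges), max(degrees, default=0), min(degrees, default=0))

def entropy(values):
    """Shannon entropy of the value distribution of one feature."""
    total = len(values)
    return -sum(c / total * log2(c / total) for c in Counter(values).values())

def build_tree(items, unused):
    """Induce a multiway decision tree over (feature_vector, graph_id) pairs.

    Inner nodes are (feature_index, {value: subtree}); leaves are id lists.
    """
    if len({vec for vec, _ in items}) <= 1 or not unused:
        return [gid for _, gid in items]
    # Split on the unused feature whose values are most spread out.
    best = max(unused, key=lambda i: entropy([vec[i] for vec, _ in items]))
    branches = defaultdict(list)
    for vec, gid in items:
        branches[vec[best]].append((vec, gid))
    rest = [i for i in unused if i != best]
    return (best, {val: build_tree(sub, rest) for val, sub in branches.items()})

def candidates(tree, vec):
    """Traverse the tree with the query's feature vector."""
    while isinstance(tree, tuple):
        index, children = tree
        if vec[index] not in children:
            return []  # no database graph shares this feature value
        tree = children[vec[index]]
    return tree

def isomorphic(g, h):
    """Brute-force isomorphism test; stands in for the expensive matcher."""
    (n, edges_g), (m, edges_h) = g, h
    if n != m or len(edges_g) != len(edges_h):
        return False
    target = {frozenset(e) for e in edges_h}
    return any(
        all(frozenset((perm[u], perm[v])) in target for u, v in edges_g)
        for perm in permutations(range(n))
    )

def retrieve(query, database, tree):
    """Filter with the tree, then fully match only the surviving candidates."""
    ids = candidates(tree, features(query))
    return [gid for gid in ids if isomorphic(query, database[gid])]

database = {
    "path3": (3, {(0, 1), (1, 2)}),
    "triangle": (3, {(0, 1), (1, 2), (0, 2)}),
    "star4": (4, {(0, 1), (0, 2), (0, 3)}),
    "path4": (4, {(0, 1), (1, 2), (2, 3)}),
}
items = [(features(g), gid) for gid, g in database.items()]
tree = build_tree(items, list(range(len(items[0][0]))))

query = (4, {(2, 0), (0, 3), (3, 1)})  # a relabelled 4-node path
print(retrieve(query, database, tree))
```

Because isomorphic graphs always have identical invariant features, the tree traversal can never discard a true match in this setting; it can only let through false candidates, which the full matcher then rejects. Subgraph isomorphism and error-tolerant matching would need features and branching rules that tolerate differences between query and database graph, which this sketch does not attempt.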

Similar resources

Comprehensive Analysis of Dense Point Cloud Filtering Algorithm for Eliminating Non-Ground Features

Point cloud and LiDAR Filtering is removing non-ground features from digital surface model (DSM) and reaching the bare earth and DTM extraction. Various methods have been proposed by different researchers to distinguish between ground and non-ground in points cloud and LiDAR data. Most fully automated methods have a common disadvantage, and they are only effective for a particular type of surf...

Matching Integral Graphs of Small Order

In this paper, we study matching integral graphs of small order. A graph is called matching integral if the zeros of its matching polynomial are all integers. Matching integral graphs were first studied by Akbari, Khalashi, etc. They characterized all traceable graphs which are matching integral. They studied matching integral regular graphs. Furthermore, it has been shown that there is no matc...

Large Scale Graph Matching(LSGM): Techniques, Tools, Applications and Challenges

Large Scale Graph Matching (LSGM) is one of the fundamental problems in Graph theory and it has applications in many areas such as Computer Vision, Machine Learning, Pattern Recognition and Big Data Analytics (Data Science). Matching belongs to the combinatorial class of problems which refers to finding correspondence between the nodes of a graph or among set of graphs (subgraphs) either precis...

Low-Rank Coding with b-Matching Constraint for Semi-Supervised Classification

Graph based semi-supervised learning (GSSL) plays an important role in machine learning systems. The most crucial step in GSSL is graph construction. Although several interesting graph construction methods have been proposed in recent years, how to construct an effective graph is still an open problem. In this paper, we develop a novel approach to constructing graph, which is based on low-rank ...

Spam Filtering using Contextual Network Graphs

This document describes a machine-learning solution to the spam-filtering problem. Spam-filtering is treated as a text-classification problem in very high dimension space. Two new text-classification algorithms, Latent Semantic Indexing (LSI) and Contextual Network Graphs (CNG) are compared to existing Bayesian techniques by monitoring their ability to process and correctly classify a series of...

Publication date: 2005